{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "2f43c205",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/jerryjliu/llama_index/blob/main/docs/examples/objects/object_index.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f062ba1-0049-472c-8300-64f83d405ffc",
   "metadata": {},
   "source": [
    "# The `ObjectIndex` Class\n",
    "\n",
    "The `ObjectIndex` class is one that allows for the indexing of arbitrary Python objects. As such, it is quite flexible and applicable to a wide-range of use cases. As examples:\n",
    "- [Use an `ObjectIndex` to index Tool objects to then be used by an agent.](https://docs.llamaindex.ai/en/stable/examples/agent/openai_agent_retrieval.html#building-an-object-index)\n",
    "- [Use an `ObjectIndex` to index a SQLTableSchema objects](https://docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/SQLIndexDemo.html#part-2-query-time-retrieval-of-tables-for-text-to-sql)\n",
    "\n",
    "To construct an `ObjectIndex`, we require an index as well as another abstraction, namely `ObjectNodeMapping`. This mapping, as its name suggests, provides the means to go between node and the associated object, and vice versa. Alternatively, there exists a `from_objects()` class method, that can conveniently construct an `ObjectIndex` from a set of objects.\n",
    "\n",
    "In this notebook, we'll quickly cover how you can build an `ObjectIndex` using a `SimpleObjectNodeMapping`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b46bc95b-e154-48e8-9475-350d2446e297",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import VectorStoreIndex\n",
    "from llama_index.core.objects import ObjectIndex, SimpleObjectNodeMapping\n",
    "\n",
    "# some really arbitrary objects\n",
    "obj1 = {\"input\": \"Hey, how's it going\"}\n",
    "obj2 = [\"a\", \"b\", \"c\", \"d\"]\n",
    "obj3 = \"llamaindex is an awesome library!\"\n",
    "arbitrary_objects = [obj1, obj2, obj3]\n",
    "\n",
    "# object-node mapping\n",
    "obj_node_mapping = SimpleObjectNodeMapping.from_objects(arbitrary_objects)\n",
    "nodes = obj_node_mapping.to_nodes(arbitrary_objects)\n",
    "\n",
    "# object index\n",
    "object_index = ObjectIndex(\n",
    "    index=VectorStoreIndex(nodes=nodes), object_node_mapping=obj_node_mapping\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bd9b90c3-4e72-46b9-b545-a816d4b9ed75",
   "metadata": {},
   "source": [
    "### As a retriever\n",
    "With the `object_index` in hand, we can use it as a retriever, to retrieve against the index objects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c8ed16df-5ea3-47bf-81a9-d9917c31d48f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['llamaindex is an awesome library!']"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "object_retriever = object_index.as_retriever(similarity_top_k=1)\n",
    "object_retriever.retrieve(\"llamaindex\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0032d0e7-815d-414d-9fcc-384709b59484",
   "metadata": {},
   "source": [
    "## Persisting `ObjectIndex`\n",
    "\n",
    "When it comes to persisting the `ObjectIndex`, we have to handle both the index as well as the object-node mapping. Persisting the index is straightforward and can be handled by usual means (e.g., see this [guide](https://docs.llamaindex.ai/en/stable/module_guides/storing/save_load.html#persisting-loading-data)). However, it's a bit of a different story when it comes to persisting the `ObjectNodeMapping`. Since we're indexing aribtrary Python objects with the `ObjectIndex`, it may be the case (and perhaps more often than we'd like), that the arbitrary objects are not serializable. In those cases, you can persist the index, but the user would have to maintain a way to re-construct the `ObjectNodeMapping` to be able to re-construct the `ObjectIndex`. For convenience, there are the `persist` and `from_persist_dir` methods on the `ObjectIndex` that will attempt to persist and load a previously saved `ObjectIndex`, respectively."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "10053dbc-e9c2-4a54-9b04-a7c66af80860",
   "metadata": {},
   "source": [
    "### Happy example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4ad58419-35b5-4010-ae3f-42f96e6c7226",
   "metadata": {},
   "outputs": [],
   "source": [
    "# persist to disk (no path provided will persist to the default path ./storage)\n",
    "object_index.persist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3aedfc02-a94b-4f0a-8e8b-a22bf76515fe",
   "metadata": {},
   "outputs": [],
   "source": [
    "# re-loading (no path provided will attempt to load from the default path ./storage)\n",
    "reloaded_object_index = ObjectIndex.from_persist_dir()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4ec0c2ba-ef26-41fb-b2ae-df946abac01c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{7981070310142320670: {'input': \"Hey, how's it going\"},\n",
       " -5984737625581842527: ['a', 'b', 'c', 'd'],\n",
       " -8305186196625446821: 'llamaindex is an awesome library!'}"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reloaded_object_index._object_node_mapping.obj_node_mapping"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "34107c44-d51c-42a9-a802-fba67deba8d0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{7981070310142320670: {'input': \"Hey, how's it going\"},\n",
       " -5984737625581842527: ['a', 'b', 'c', 'd'],\n",
       " -8305186196625446821: 'llamaindex is an awesome library!'}"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "object_index._object_node_mapping.obj_node_mapping"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8078b5ff-0047-4a0c-96ea-a9a768e060ae",
   "metadata": {},
   "source": [
    "### Example of when it doesn't work"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7b19b0c3-3bfb-4ed8-bcf0-d04615afbbd5",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.tools import FunctionTool\n",
    "from llama_index.core import SummaryIndex\n",
    "from llama_index.core.objects import SimpleToolNodeMapping\n",
    "\n",
    "\n",
    "def add(a: int, b: int) -> int:\n",
    "    \"\"\"Add two integers and returns the result integer\"\"\"\n",
    "    return a + b\n",
    "\n",
    "\n",
    "def multiply(a: int, b: int) -> int:\n",
    "    \"\"\"Multiple two integers and returns the result integer\"\"\"\n",
    "    return a * b\n",
    "\n",
    "\n",
    "multiply_tool = FunctionTool.from_defaults(fn=multiply)\n",
    "add_tool = FunctionTool.from_defaults(fn=add)\n",
    "\n",
    "object_mapping = SimpleToolNodeMapping.from_objects([add_tool, multiply_tool])\n",
    "object_index = ObjectIndex.from_objects(\n",
    "    [add_tool, multiply_tool], object_mapping\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "63c777a2-d73b-4916-96c4-794fe5ebcac5",
   "metadata": {},
   "outputs": [
    {
     "ename": "NotImplementedError",
     "evalue": "Subclasses should implement this!",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNotImplementedError\u001b[0m                       Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[4], line 2\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[38;5;66;03m# trying to persist the object_mapping directly will raise an error\u001b[39;00m\n\u001b[0;32m----> 2\u001b[0m \u001b[43mobject_mapping\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mpersist\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n",
      "File \u001b[0;32m~/Projects/llama_index/llama_index/objects/tool_node_mapping.py:47\u001b[0m, in \u001b[0;36mBaseToolNodeMapping.persist\u001b[0;34m(self, persist_dir, obj_node_mapping_fname)\u001b[0m\n\u001b[1;32m     43\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mpersist\u001b[39m(\n\u001b[1;32m     44\u001b[0m     \u001b[38;5;28mself\u001b[39m, persist_dir: \u001b[38;5;28mstr\u001b[39m \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m.\u001b[39m\u001b[38;5;241m.\u001b[39m\u001b[38;5;241m.\u001b[39m, obj_node_mapping_fname: \u001b[38;5;28mstr\u001b[39m \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m.\u001b[39m\u001b[38;5;241m.\u001b[39m\u001b[38;5;241m.\u001b[39m\n\u001b[1;32m     45\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[1;32m     46\u001b[0m \u001b[38;5;250m    \u001b[39m\u001b[38;5;124;03m\"\"\"Persist objs.\"\"\"\u001b[39;00m\n\u001b[0;32m---> 47\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mNotImplementedError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mSubclasses should implement this!\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
      "\u001b[0;31mNotImplementedError\u001b[0m: Subclasses should implement this!"
     ]
    }
   ],
   "source": [
    "# trying to persist the object_mapping directly will raise an error\n",
    "object_mapping.persist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "de77bbca-9ba8-46f9-a66d-60cf1ce143ad",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/var/folders/0g/wd11bmkd791fz7hvgy1kqyp00000gn/T/ipykernel_77363/46708458.py:2: UserWarning: Unable to persist ObjectNodeMapping. You will need to reconstruct the same object node mapping to build this ObjectIndex\n",
      "  object_index.persist()\n"
     ]
    }
   ],
   "source": [
    "# try to persist the object index here will throw a Warning to the user\n",
    "object_index.persist()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ee002c84-fe00-43e5-b0cb-53f6fb547b13",
   "metadata": {},
   "source": [
    "**In this case, only the index has been persisted.** In order to re-construct the `ObjectIndex` as mentioned above, we will need to manually re-construct `ObjectNodeMapping` and supply that to the `ObjectIndex.from_persist_dir` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cbc69da5-d2f0-4feb-9527-436cd0d0a54b",
   "metadata": {},
   "outputs": [],
   "source": [
    "reloaded_object_index = ObjectIndex.from_persist_dir(\n",
    "    object_node_mapping=object_mapping  # without this, an error will be thrown\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama_index_3.10",
   "language": "python",
   "name": "llama_index_3.10"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
